
Alibaba Cloud · Chat / LLM · 235B Parameters (22B Active) · 128K Context

Streaming · Reasoning · Long Context · Multilingual · Code · Structured Output

Overview
Qwen3 Max is Alibaba Cloud's most powerful model in the Qwen3 series, featuring a 235B Sparse Mixture-of-Experts Transformer with 22B parameters active per forward pass. Developed by Alibaba Cloud, the cloud computing arm of Alibaba Group and creator of the Qwen model family, it delivers frontier-level performance in complex reasoning, multilingual tasks, long-context understanding, and advanced coding, rivaling GPT-4o and Claude Sonnet on major benchmarks. With 128K context, support for 29+ languages, and a hybrid thinking mode, Qwen3 Max is built for demanding enterprise workloads. Served instantly via the Qubrid AI Serverless API.

235B MoE. Rivals GPT-4o and Claude Sonnet. 29+ languages. 128K context. Access via Qubrid AI with no DashScope setup required.
Model Specifications
| Field | Details |
|---|---|
| Model ID | Qwen/Qwen3-Max |
| Provider | Alibaba Cloud (Qwen Team) |
| Kind | Chat / LLM |
| Architecture | Sparse Mixture-of-Experts (MoE) Transformer, 235B total / 22B active per token |
| Parameters | 235B total (22B active per forward pass) |
| Context Length | 128,000 Tokens |
| MoE | Yes |
| Release Date | April 2025 |
| License | Proprietary (Alibaba Cloud DashScope API only) |
| Training Data | Large-scale multilingual pretraining corpus with RLHF post-training (not publicly disclosed) |
| Function Calling | Not Supported |
| Image Support | N/A |
| Serverless API | Available |
| Fine-tuning | Coming Soon |
| On-demand | Coming Soon |
| State | 🟢 Ready |
Pricing
💳 Access via the Qubrid AI Serverless API with pay-per-token pricing. No infrastructure management required.
| Token Type | Price per 1M Tokens |
|---|---|
| Input Tokens | $1.20 |
| Input Tokens (Cached) | $0.24 |
| Output Tokens | $6.00 |
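As a rough illustration of the table above, per-request cost can be estimated from the three rates. This is a sketch for planning purposes; actual billing follows Qubrid's metering.

```python
# Prices from the table above, in USD per 1M tokens.
PRICE_INPUT = 1.20
PRICE_INPUT_CACHED = 0.24
PRICE_OUTPUT = 6.00

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate request cost in USD.

    `cached_tokens` is the portion of `input_tokens` served from cache,
    billed at the lower cached-input rate.
    """
    uncached = input_tokens - cached_tokens
    cost = (
        uncached * PRICE_INPUT
        + cached_tokens * PRICE_INPUT_CACHED
        + output_tokens * PRICE_OUTPUT
    ) / 1_000_000
    return round(cost, 6)

# A 100K-token cached context plus a 2K-token question and a 1K-token answer:
print(estimate_cost(input_tokens=102_000, output_tokens=1_000, cached_tokens=100_000))
```

Caching matters at this context length: the same 102K-token request with no cache hits would cost roughly four times as much in input charges alone.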
Quickstart
Prerequisites
- Create a free account at platform.qubrid.com
- Generate your API key from the API Keys section
- Replace QUBRID_API_KEY in the code below with your actual key
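A minimal Python sketch of calling the Serverless API, assuming an OpenAI-style chat completions endpoint. The base URL below is a placeholder assumption; confirm the exact endpoint in the Qubrid docs.

```python
import json
import os
import urllib.request

# Placeholder base URL (an assumption) -- check docs.platform.qubrid.com
# for the real Serverless API endpoint.
BASE_URL = "https://platform.qubrid.com/v1"

def build_chat_request(prompt: str, api_key: str):
    """Assemble URL, headers, and JSON body for an OpenAI-style chat completion."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": "Qwen/Qwen3-Max",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 4096,
    }).encode("utf-8")
    return url, headers, body

def ask(prompt: str) -> str:
    """Send the request and return the assistant's reply (requires network access)."""
    url, headers, body = build_chat_request(prompt, os.environ["QUBRID_API_KEY"])
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the API is OpenAI-compatible, the official OpenAI SDK should also work by passing the Qubrid base URL and your key to the client constructor.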
Live Example
Prompt: Write a short story about a robot learning to paint
Playground Features
The Qubrid AI Playground lets you interact with Qwen3 Max directly in your browser: no setup, no code, no cost to explore.

🧠 System Prompt
Define the model's role, language, reasoning depth, and output format before the conversation begins. Ideal for enterprise assistants, multilingual workflows, and structured analysis pipelines. Set your system prompt once in the Qubrid Playground and it applies across every turn of the conversation.
🎯 Few-Shot Examples
Guide the model's output format and reasoning style with concrete examples; no fine-tuning, no retraining required.

| User Input | Assistant Response |
|---|---|
| Translate and summarize this paragraph in Spanish | [Translated summary in Spanish, preserving key facts and tone of the original] |
| Review this code and suggest improvements | Issues found: 1) O(n²) loop on line 12: replace with hash map for O(n). 2) Missing null check on line 7. 3) Variable name 'x' is ambiguous; rename to 'user_count' for clarity. |
💡 Add few-shot examples in the Qubrid Playground to establish preferred output language, format, and domain focus without any fine-tuning.
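Over the API, few-shot examples are expressed as alternating user/assistant messages placed before the real query, after the system prompt. A sketch of building such a messages list (function and variable names here are illustrative, not part of any SDK):

```python
def with_few_shot(system_prompt, examples, user_query):
    """Build an OpenAI-style messages list: system prompt, then each
    (user, assistant) example pair, then the real user query."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, assistant_msg in examples:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": user_query})
    return messages

msgs = with_few_shot(
    "You are a concise code reviewer.",
    [("Review: for i in range(len(xs)): print(xs[i])",
      "Issue: index-based loop; iterate directly over xs instead.")],
    "Review this function for performance issues.",
)
# 1 system message + 2 example-turn messages + 1 real query = 4 messages
```

The model treats the example turns as prior conversation, so it tends to mirror their format and tone when answering the final query.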
Inference Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Streaming | boolean | true | Enable streaming responses for real-time output |
| Temperature | number | 0.7 | Controls creativity and randomness. Higher values produce more diverse output |
| Max Tokens | number | 4096 | Maximum number of tokens the model can generate |
| Top P | number | 1 | Nucleus sampling threshold. Lower values restrict sampling to higher-probability tokens for more predictable output |
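With Streaming enabled, OpenAI-compatible APIs typically deliver the response as server-sent events: one JSON chunk per `data:` line, terminated by `data: [DONE]`. A minimal parser sketch, assuming that standard chunk format (the exact wire format should be confirmed against the Qubrid docs):

```python
import json

def collect_stream_text(sse_lines):
    """Concatenate content deltas from OpenAI-style SSE 'data:' lines."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:  # first chunk often carries only the role
            parts.append(delta["content"])
    return "".join(parts)

sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":", world"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # Hello, world
```

In practice the OpenAI SDK handles this parsing for you when `stream=True`; the sketch above is useful mainly for raw HTTP clients.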
Use Cases
- Complex multi-step reasoning
- Advanced coding and debugging
- Research and analytical writing
- Long-document summarization
- Multilingual chat and translation
- Enterprise chatbots and assistants
Strengths & Limitations
| Strengths | Limitations |
|---|---|
| 235B MoE architecture: frontier-level intelligence with 22B active per token | Closed-source; no self-hosting or weight access |
| Rivals GPT-4o and Claude Sonnet on key reasoning and coding benchmarks | Higher latency than smaller Qwen models |
| Up to 128K context window for long-document workflows | Higher cost per token vs open-source alternatives |
| Strong multilingual performance across 29+ languages | Function calling not supported |
| Excellent structured output and instruction following | |
| Hybrid thinking mode for complex reasoning tasks | |
Why Qubrid AI?
- No DashScope setup required: access Qwen3 Max directly via the Qubrid AI Serverless API with a single API key
- OpenAI-compatible: drop-in replacement using the same SDK, just swap the base URL
- 💰 Cached input pricing: $0.24/1M for cached tokens, reducing costs significantly on repeated long-context workloads
- 🧪 Built-in Playground: prototype with system prompts and few-shot examples instantly at platform.qubrid.com
- Full observability: API logs and usage tracking built into the Qubrid dashboard
- Multi-language support: Python, JavaScript, Go, cURL out of the box
Resources
| Resource | Link |
|---|---|
| Qubrid Docs | docs.platform.qubrid.com |
| 🎮 Playground | Try Qwen3 Max live |
| API Keys | Get your API Key |
| 🤗 Hugging Face | Qwen/Qwen3-Max |
| 💬 Discord | Join the Qubrid Community |
Built with ❤️ by Qubrid AI
Frontier models. Serverless infrastructure. Zero friction.